Optimized Register Renaming Scheme for Stack-Based x86 Operations
Authors
Abstract
The stack-based floating-point unit (FPU) of the x86 architecture limits its floating-point (FP) performance. A flat register file can improve FP performance but compromises x86 compatibility. This paper presents an optimized two-phase floating-point register renaming scheme used in implementing an x86-compliant processor. The two-phase scheme eliminates both the implicit dependencies between consecutive FP instructions and redundant operations. In two applications of the method, the techniques used in the second phase eliminate redundant loads and reduce the mis-speculation ratio of the load-store queue. Moreover, the performance of a binary translation system that translates x86 instructions to a MIPS-like ISA can also be improved by adding the related architectural support from this scheme to that architecture.
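To make the idea of renaming stack-relative operands concrete, here is a minimal Python sketch. It is not the scheme from the paper, whose details the abstract does not give. It assumes a generic two-step lookup: ST(i) is first resolved to a flat logical register through the top-of-stack pointer, and that logical register is then mapped to a physical register. The class name `X87Renamer`, its methods, and the 16-entry physical register file are all hypothetical. Handling FXCH by swapping map entries is a commonly described technique and is used here only as an illustration of removing a redundant data move.

```python
class X87Renamer:
    """Toy model (hypothetical): ST(i) -> flat logical register -> physical register."""

    def __init__(self, num_physical=16):
        self.top = 0                               # x87 top-of-stack pointer, 0..7
        self.rename = list(range(8))               # logical register -> physical register
        self.free = list(range(8, num_physical))   # unallocated physical registers

    def logical(self, i):
        """Step 1: resolve stack-relative ST(i) to a flat logical register index."""
        return (self.top + i) % 8

    def physical(self, i):
        """Step 2: look up the physical register currently holding ST(i)."""
        return self.rename[self.logical(i)]

    def push(self):
        """FLD-style push: decrement TOP and map the new ST(0) to a fresh register."""
        self.top = (self.top - 1) % 8
        slot = self.logical(0)
        # Simplification (assumption): the old mapping is recycled immediately,
        # whereas a real pipeline would free it only when the instruction commits.
        self.free.append(self.rename[slot])
        self.rename[slot] = self.free.pop(0)
        return self.rename[slot]

    def pop(self):
        """FSTP-style pop: increment TOP; the mapping table is left unchanged."""
        self.top = (self.top + 1) % 8

    def fxch(self, i):
        """Exchange ST(0) and ST(i) by swapping map entries; no data is moved."""
        a, b = self.logical(0), self.logical(i)
        self.rename[a], self.rename[b] = self.rename[b], self.rename[a]


r = X87Renamer()
first = r.push()    # value A lands in a fresh physical register
second = r.push()   # value B lands in another one; A is now ST(1)
r.fxch(1)           # after the swap, ST(0) names A's register and ST(1) names B's
print(first, second, r.physical(0), r.physical(1))
```

In this sketch two consecutive pushes write different physical registers, so the second does not have to wait on the first even though both target ST(0). That is the kind of implicit stack dependency the abstract refers to.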
Similar resources
Stack Renaming of the Java Virtual
This study proposes a scheme to map the operand stack of the Java Virtual Machine to hardware registers and evaluates the performance benefits of the proposed scheme. By using register renaming while mapping the stack to registers, we are able to exploit the inherent parallelism in the instruction stream. The simulation results show an improvement of about 15%-26% for th...
Improving Memory Access Performance Using a Code Coalescing Unit
High clock frequencies combined with deep pipelining employed by many of the state-of-the-art processors have forced cache hit accesses to be multi-cycle operations. For many programs, untolerated load latencies account for a significant portion of total execution time. In this paper, we present a mechanism called the Code Coalescing Unit (CCU) that can identify and eliminate at run-time several...
Dynamic Register Renaming Through Virtual-Physical Registers
Register file access time represents one of the critical delays of current microprocessors, and it is expected to become more critical as future processors increase the instruction window size and the issue width. This paper presents a novel dynamic register renaming scheme that delays the allocation of physical registers until a late stage in the pipeline. We show that it can provide important ...
Delft-Java Dynamic Translation
This paper describes the DELFT-JAVA processor and the mechanisms required to dynamically translate JVM instructions into DELFT-JAVA instructions. Using a form of hardware register allocation, we transform stack bottlenecks into pipeline dependencies which are later removed using register renaming and interlock collapsing arithmetic units. When combined with superscalar techniques and multiple i...
SMLNJ: Intel x86 back end Compiler Controlled Memory
This note describes the code generation algorithm used for the Intel x86, introduced in version 110.16. The standard Chaitin graph coloring register allocation cannot be used directly for machines with few registers, as all temporaries wind up being spilled, making for a poor allocation[Cha82]. Thus, for the x86, the conceptual model of the architecture has been extended with a set of memory lo...
Journal title:
Volume, issue:
Pages: -
Publication date: 2007